sage-attention: avoid torch custom_op for sm100 and add benchmark by drbh · Pull Request #452 · huggingface/kernels-community

drbh · 2026-03-06T18:15:13Z

This PR:

- avoids @torch.library.custom_op in sm100_compile.py, following the approach already used in sm90_compile.py
- adds a benchmark file
- adds a README example with a simple validation script
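For context on what an attention benchmark usually reports, here is a minimal sketch that turns a measured latency into throughput. It is not taken from the benchmark file in this PR: `attention_flops` and `achieved_tflops` are hypothetical helper names, and the 4·B·H·L²·D count is the common convention for a non-causal forward pass (two matmuls at 2·L·L·D FLOPs each per head, softmax ignored).

```python
def attention_flops(batch: int, heads: int, seq_len: int, head_dim: int) -> int:
    """FLOPs for one non-causal attention forward pass (QK^T and PV matmuls only)."""
    # Each of the two matmuls costs 2 * seq_len * seq_len * head_dim FLOPs per head.
    return 4 * batch * heads * seq_len * seq_len * head_dim


def achieved_tflops(flops: int, latency_ms: float) -> float:
    """Convert a FLOP count and a latency in milliseconds to TFLOP/s."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return flops / (latency_ms * 1e-3) / 1e12


if __name__ == "__main__":
    flops = attention_flops(batch=1, heads=8, seq_len=256, head_dim=64)
    # 0.05 ms is a made-up latency, used only to show the arithmetic.
    print(f"{flops} FLOPs -> {achieved_tflops(flops, 0.05):.2f} TFLOP/s")
```

When the latency comes from timing a CUDA kernel, remember that launches are asynchronous: call `torch.cuda.synchronize()` before reading a host clock, or time with CUDA events.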

drbh · 2026-03-06T18:29:41Z

Note: these changes may resolve the double-registration issue seen on B200s.

Test with:

# /// script
# dependencies = [
#   "numpy",
#   "torch",
#   "kernels",
# ]
# ///
import torch
from kernels import get_kernel

torch.manual_seed(42)
sage_attention = get_kernel("drbh/sage-attn-test", version=2)

device = "cuda"
B, H, L, D = 1, 8, 256, 64
q = torch.randn(B, H, L, D, dtype=torch.bfloat16, device=device)
k = torch.randn(B, H, L, D, dtype=torch.bfloat16, device=device)
v = torch.randn(B, H, L, D, dtype=torch.bfloat16, device=device)

out = sage_attention.sageattn3_blackwell(q, k, v)
print(f"sageattn output shape: {out.shape}")
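The script above only confirms that the kernel runs and returns a tensor of the expected shape. A stricter check would compare `out` against a trusted reference. The following is a minimal, dependency-free sketch of two common comparison metrics; `cosine_similarity` and `max_abs_diff` are hypothetical helpers, and no pass/fail threshold is suggested because the PR does not state one.

```python
import math
from typing import Sequence


def _check_lengths(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("inputs must have the same number of elements")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length flat sequences."""
    _check_lengths(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero input")
    return dot / (norm_a * norm_b)


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat sequences."""
    _check_lengths(a, b)
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Stand-in values; in practice these would be flattened tensor contents.
    kernel_out = [0.10, -0.20, 0.30]
    reference = [0.11, -0.19, 0.29]
    print(f"cosine similarity: {cosine_similarity(kernel_out, reference):.6f}")
    print(f"max abs diff:      {max_abs_diff(kernel_out, reference):.6f}")
```

To feed tensors into these helpers, flatten them first, for example `out.float().flatten().tolist()`, and do the same for a reference tensor `ref` (an assumed name) produced by an attention implementation you trust.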

sage-attention: avoid torch custom_op for sm100 and add benchmark and readme example

e47f639

drbh requested a review from danieldk as a code owner March 6, 2026 18:15

danieldk approved these changes Apr 20, 2026

drbh merged commit 7535f60 into main May 19, 2026
7 of 10 checks passed

drbh merged 1 commit into main from update-sage-attn-ops
